Classification using the breast cancer dataset

Many of the features are highly multicollinear, which indicates redundancy and may lead to overfitting, so we are eliminating certain features.
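
One way to spot the redundant features is a pairwise correlation check. The sketch below is only an illustration: it assumes the scikit-learn copy of the breast cancer dataset, and the 0.9 cutoff (`THRESHOLD`) is a hypothetical value, not one taken from this analysis.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer

# Assumption: scikit-learn's bundled dataset stands in for the data used here.
data = load_breast_cancer(as_frame=True)
X = data.data

# Absolute pairwise Pearson correlations between all features.
corr = X.corr().abs()

# Keep only the strict upper triangle so each feature pair appears once.
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))

# Hypothetical cutoff: drop a feature if it correlates above 0.9
# with any feature that comes before it in the column order.
THRESHOLD = 0.9
to_drop = [col for col in upper.columns if (upper[col] > THRESHOLD).any()]
X_reduced = X.drop(columns=to_drop)

print(f"Dropped {len(to_drop)} of {X.shape[1]} features")
print("Remaining:", list(X_reduced.columns))
```

The cutoff trades redundancy against information loss, so it is worth trying a few values before fixing one.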

Feature selection

The two best features selected
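
As a rough sketch of how two features can be picked, `SelectKBest` with `k=2` can be run once with the chi2 score and once with the ANOVA F-score (`f_classif`). Fitting on the full scikit-learn copy of the dataset is an assumption made here for brevity; the `selected` dictionary is an illustrative name.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, chi2, f_classif

# Assumption: scikit-learn's bundled dataset stands in for the data used here.
data = load_breast_cancer()
X, y = data.data, data.target

selected = {}
for name, score_func in [("chi2", chi2), ("ANOVA", f_classif)]:
    # chi2 requires non-negative inputs, which holds for these measurements.
    selector = SelectKBest(score_func=score_func, k=2).fit(X, y)
    # get_support() returns a boolean mask over the original columns.
    selected[name] = list(data.feature_names[selector.get_support()])
    print(f"{name}: {selected[name]}")
```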

Create KNN model

The accuracy score using feature selection (chi2) and KNN is 96%.

The accuracy score using feature selection (ANOVA) and KNN is 93%.
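
The two feature selection variants can be compared side by side with a small pipeline. This is a minimal sketch under stated assumptions: the 80/20 stratified split, `random_state=42`, `n_neighbors=5`, and the scaling step are all choices made for this example, so the printed scores need not match the figures reported above.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, chi2, f_classif
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

# Assumed split; the original split is not stated.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

scores = {}
for name, score_func in [("chi2", chi2), ("ANOVA", f_classif)]:
    model = make_pipeline(
        # Selection comes first because chi2 needs non-negative values.
        SelectKBest(score_func=score_func, k=2),
        # Assumed step: scale the two kept features for the distance metric.
        StandardScaler(),
        KNeighborsClassifier(n_neighbors=5),
    )
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))
    print(f"{name} + KNN accuracy: {scores[name]:.3f}")
```

Keeping the selector inside the pipeline means it is fitted on the training split only, which avoids leaking test information into the feature choice.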

PCA dimensionality reduction technique

KNN with PCA gives the highest accuracy, at 96%.
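
A minimal sketch of the PCA plus KNN setup follows. The number of components (2), the split, and `n_neighbors=5` are assumptions for illustration, since the original settings are not given, so the score printed here may differ from the one above.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

model = make_pipeline(
    StandardScaler(),       # PCA is sensitive to feature scale
    PCA(n_components=2),    # assumed number of components
    KNeighborsClassifier(n_neighbors=5),
)
model.fit(X_train, y_train)

pca_acc = accuracy_score(y_test, model.predict(X_test))
# Share of total variance kept by the retained components.
explained = model.named_steps["pca"].explained_variance_ratio_.sum()

print(f"PCA + KNN accuracy: {pca_acc:.3f}")
print(f"Variance explained by 2 components: {explained:.3f}")
```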

Decision tree model
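
For reference, a decision tree baseline could look like the sketch below. The depth limit of 4 and the split are assumptions, not the settings used in this analysis. The feature importances give another view of which features carry most of the signal.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, random_state=42, stratify=data.target
)

# Assumed depth limit to keep the tree from memorising the training set.
tree = DecisionTreeClassifier(max_depth=4, random_state=42)
tree.fit(X_train, y_train)
tree_acc = tree.score(X_test, y_test)

# Rank features by impurity-based importance, highest first.
ranked = sorted(
    zip(data.feature_names, tree.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)

print(f"Decision tree accuracy: {tree_acc:.3f}")
for feature, importance in ranked[:5]:
    print(f"{feature}: {importance:.3f}")
```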

Random forest model

Random forest model with PCA
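
The random forest can be fitted with and without PCA on the same split to see the effect of the reduction. This is a sketch under assumptions: 200 trees, 2 principal components, and the split below are illustrative values only.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

models = {
    # Forest on all original features.
    "random forest": RandomForestClassifier(n_estimators=200, random_state=42),
    # Forest on the first two principal components (assumed count).
    "random forest + PCA": make_pipeline(
        StandardScaler(),
        PCA(n_components=2),
        RandomForestClassifier(n_estimators=200, random_state=42),
    ),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    results[name] = model.score(X_test, y_test)
    print(f"{name} accuracy: {results[name]:.3f}")
```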

XGBoost Classifier model

Conclusion

After analysing all models, I recommend implementing KNN with PCA, as it shows high performance without overfitting.
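
One way to check the overfitting claim is to compare training and validation accuracy under cross-validation: a small gap suggests the model generalises. The sketch below assumes 5 folds, 2 principal components, and `n_neighbors=5`, none of which are stated in this analysis.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_validate
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

model = make_pipeline(
    StandardScaler(),
    PCA(n_components=2),                  # assumed number of components
    KNeighborsClassifier(n_neighbors=5),  # assumed neighbour count
)

# return_train_score=True also reports accuracy on each training fold.
cv = cross_validate(
    model, X, y, cv=5, scoring="accuracy", return_train_score=True
)

train_mean = cv["train_score"].mean()
test_mean = cv["test_score"].mean()
gap = train_mean - test_mean

print(f"Mean train accuracy:      {train_mean:.3f}")
print(f"Mean validation accuracy: {test_mean:.3f}")
print(f"Gap:                      {gap:.3f}")
```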